Massive Stochastic Testing of SQL

Author

  • Donald R. Slutz
Abstract

Deterministic testing of SQL database systems is human-intensive and cannot adequately cover the SQL input domain. A system, RAGS, was built to stochastically generate valid SQL statements 1 million times faster than a human and execute them. This paper describes RAGS and the results from turning it loose on several commercial SQL systems.

1. Testing SQL is Hard

Good test coverage of commercial SQL database systems is very hard to achieve. The input domain, all SQL statements, from any number of users, combined with all states of the database, is gigantic. It is also difficult to verify output for positive tests because the semantics of SQL are complicated.

Software engineering technology exists to predictably improve quality ([1] for example). The techniques involve a software development process including unit tests and final system validation tests (to verify the absence of bugs). This process requires a substantial investment, so commercial SQL vendors with tight schedules tend to use a more ad hoc process. The most popular method is rapid development followed by test-repair cycles.

SQL test groups mainly use deterministic testing. They compose test scripts of SQL statements that cover individual features of the language and commonly used combinations of features. The scripts are continuously extended to cover added features and to verify bug fixes. If the test-repair cycles uncover particularly buggy areas, more detailed scripts for those areas are added.

Typical SQL test libraries contain tens of thousands of statements and require an estimated 1⁄2 person-hour per statement to compose. These test libraries cover an important, but minute, fraction of the SQL input domain.

Stochastic testing can be used to increase the coverage. Stochastic techniques are used to mix several deterministic streams for concurrency tests and for scaled-up load testing. Large increases in test coverage must come from automating the generation of tests.
This paper describes a method to rapidly create a very large number of SQL statements without human intervention. The SQL statements are generated stochastically (or 'randomly'), which provides the speed as well as wider coverage of the input domain. The challenge is to distribute the SQL statements in useful regions of the input domain. If the distribution is adequate, stochastic testing has the advantage that the quality of the tests improves as the test size increases [2].

A system called RAGS (Random Generation of SQL) was built to explore automated testing. RAGS is currently used by the Microsoft SQL Server [3] testing group. This paper describes how RAGS works and how it was evolved to be a more effective tool. We focus on positive tests in this paper, but mention other kinds of tests in the summary.

Figure 1 illustrates the test coverage problem. Customers use the hexagon, bugs are in the oval, and the test libraries cover the shaded circle.

[Figure 1: The input domain (all possible SQL statements and database states), showing the region used by customers, the region of detectable software bugs, and the SQL test library coverage, with regions numbered 1, 2, and 3. Caption: SQL test library coverage should include at least region 2. Unfortunately, we don't know the actual region boundaries.]

Footnote 1: SQL testing procedures and bug counts are proprietary, so there is little public information.

2. The RAGS System

The RAGS approach is:

1. Greatly enlarge the shaded circle in Figure 1 by stochastic SQL statement generation.
2. Make all aspects of the generated SQL statements configurable.
3. Experiment with configurations to maximize the bug detection rate.
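The excerpt above does not describe how RAGS builds its statements, so the following is only a minimal sketch of the general idea of configurable stochastic SQL generation, not the RAGS implementation. The schema, the `CONFIG` knobs, and every function name are hypothetical, and SQLite (through Python's standard `sqlite3` module) stands in for the commercial systems the paper targets.

```python
import random
import sqlite3

# Hypothetical schema, for illustration only (not taken from the paper).
SCHEMA = {
    "orders": ["id", "customer_id", "amount"],
    "customers": ["id", "region", "credit"],
}

# Hypothetical configuration knobs that shape the generated statements.
CONFIG = {
    "max_columns": 3,     # upper bound on the select-list length
    "p_where": 0.7,       # probability of adding a WHERE clause
    "p_and": 0.4,         # probability of appending another AND term
    "max_terms": 4,       # hard cap on terms in one predicate
    "p_order_by": 0.3,    # probability of adding ORDER BY
    "max_literal": 100,   # integer literals are drawn from 0..max_literal
}

OPERATORS = ["=", "<", ">", "<=", ">=", "<>"]


def random_predicate(rng, columns, config):
    """Build one or more integer comparisons joined with AND."""
    terms = []
    while True:
        column = rng.choice(columns)
        operator = rng.choice(OPERATORS)
        literal = rng.randint(0, config["max_literal"])
        terms.append(f"{column} {operator} {literal}")
        if len(terms) >= config["max_terms"] or rng.random() >= config["p_and"]:
            break
    return " AND ".join(terms)


def random_select(rng, schema=SCHEMA, config=CONFIG):
    """Generate one syntactically valid SELECT over a randomly chosen table."""
    table = rng.choice(sorted(schema))
    columns = schema[table]
    count = rng.randint(1, min(config["max_columns"], len(columns)))
    select_list = rng.sample(columns, count)
    sql = f"SELECT {', '.join(select_list)} FROM {table}"
    if rng.random() < config["p_where"]:
        sql += " WHERE " + random_predicate(rng, columns, config)
    if rng.random() < config["p_order_by"]:
        sql += " ORDER BY " + rng.choice(select_list)
    return sql


def make_database(schema=SCHEMA):
    """Create an in-memory SQLite database with empty integer tables."""
    conn = sqlite3.connect(":memory:")
    for table, columns in schema.items():
        column_defs = ", ".join(f"{name} INTEGER" for name in columns)
        conn.execute(f"CREATE TABLE {table} ({column_defs})")
    return conn


def run_batch(seed, count):
    """Generate and execute `count` statements; return the SQL that ran."""
    rng = random.Random(seed)  # seeding makes any failing batch reproducible
    conn = make_database()
    executed = []
    for _ in range(count):
        sql = random_select(rng)
        conn.execute(sql).fetchall()  # raises sqlite3.Error on an invalid statement
        executed.append(sql)
    conn.close()
    return executed
```

Calling `run_batch(seed=1, count=200)` generates and executes 200 statements. Changing the probabilities in `CONFIG` shifts where the statements fall in the input domain, which is the kind of experimentation the third step of the approach calls for. A real harness would also need a way to check the returned rows, which this sketch leaves out.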



Publication date: 1998